Theano-based Large-Scale Visual Recognition with Multiple GPUs

Authors

  • Weiguang Ding
  • Ruoyan Wang
  • Fei Mao
  • Graham W. Taylor
Abstract

In this report, we describe a Theano-based AlexNet (Krizhevsky et al., 2012) implementation and its naive data parallelism on multiple GPUs. Our performance on 2 GPUs is comparable with that of the state-of-the-art Caffe library (Jia et al., 2014) run on 1 GPU. To the best of our knowledge, this is the first open-source Python-based AlexNet implementation to date.
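The abstract does not spell out how the data parallelism is synchronized, so the following is only a minimal sketch of one common "naive" scheme, under stated assumptions: each simulated GPU keeps its own replica of the parameters, takes an SGD step on its own shard of the minibatch, and the replicas then average their parameters. The 1-D linear model, the function names (`grad`, `train_data_parallel`), and all hyperparameters are hypothetical and chosen for illustration; no Theano or GPU code is involved.

```python
import random


def grad(w, batch):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train_data_parallel(data, n_workers=2, lr=0.05, steps=200,
                        batch_size=8, seed=0):
    """Simulate data-parallel SGD with parameter averaging.

    Assumption: replicas synchronize by averaging weights after every
    step. This is an illustrative scheme, not necessarily the report's.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_workers  # one model replica per simulated GPU
    for _ in range(steps):
        # Draw one large minibatch and split it evenly across workers.
        batch = rng.sample(data, batch_size * n_workers)
        for k in range(n_workers):
            shard = batch[k * batch_size:(k + 1) * batch_size]
            weights[k] -= lr * grad(weights[k], shard)
        # Synchronization step: every replica adopts the mean weight.
        mean_w = sum(weights) / n_workers
        weights = [mean_w] * n_workers
    return weights


# Noise-free toy data generated from y = 3 * x.
data = [(x / 10.0, 3.0 * x / 10.0) for x in range(-20, 21)]
final = train_data_parallel(data)
```

After training, both replicas hold the same weight, which should be close to the true slope of 3. In a real multi-GPU setting the averaging step is where inter-device communication cost appears, which is one reason such a scheme is called naive.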

Similar resources

Using GPUs in Machine Learning

Machine learning algorithms have been known to perform better with more free parameters to tune and more training data. But learning algorithms are often too slow for large scale applications and thus the size of training models (free parameters) and data is limited in practice. So usage of GPUs to improve the speeds of these algorithms has attracted a lot of attention recently. I focused on a ...

Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines

Deep learning models, which learn high-level feature representations from raw data, have become popular for machine learning and artificial intelligence tasks that involve images, audio, and other forms of complex data. A number of software “frameworks” have been developed to expedite the process of designing and training deep neural networks, such as Caffe [11], Torch [4], and Theano [1]. Curr...

Theano-MPI: A Theano-Based Distributed Training Framework

We develop a scalable and extendable training framework that can utilize GPUs across nodes in a cluster and accelerate the training of deep learning models based on data parallelism. Both synchronous and asynchronous training are implemented in our framework, where parameter exchange among GPUs is based on CUDA-aware MPI. In this report, we analyze the convergence and capability of the framewor...

Synkhronos: a Multi-GPU Theano Extension for Data Parallelism

We present Synkhronos, an extension to Theano for multi-GPU computations leveraging data parallelism. Our framework provides automated execution and synchronization across devices, allowing users to continue to write serial programs without risk of race conditions. The NVIDIA Collective Communication Library is used for high-bandwidth inter-GPU communication. Further enhancements to the Theano ...

Kaldi+PDNN: Building DNN-based ASR Systems with Kaldi and PDNN

The Kaldi toolkit is becoming popular for constructing automated speech recognition (ASR) systems. Meanwhile, in recent years, deep neural networks (DNNs) have shown state-of-the-art performance on various ASR tasks. This document describes our recipes to implement fully-fledged DNN acoustic modeling using Kaldi and PDNN. PDNN is a lightweight deep learning toolkit developed under the Theano ...

Journal:
  • CoRR

Volume: abs/1412.2302

Publication date: 2014